Cache-Based Matrix Technology for Efficient Write and Recovery in Erasure Coding Distributed File Systems
Authors
Abstract
With the development of various information and communication technologies, the amount of big data has increased, and distributed file systems have emerged to store it stably. The replication technique divides the original data into blocks and writes them on multiple servers for redundancy and fault tolerance. However, there is a symmetrical space efficiency problem that arises from the need for storage larger than the original data. When storing data, Erasure Coding (EC) generates parity blocks through encoding calculations and writes them separately on each server for fault tolerance and recovery purposes. Even if a specific server fails, the original data can still be recovered through decoding, using the blocks stored on the remaining servers. However, the matrices generated during encoding and decoding are redundantly regenerated during writing and recovery, which leads to unnecessary overhead in distributed file systems. This paper proposes a cache-based matrix technique that uploads the matrices to cache memory and reuses them, rather than generating new matrices each time encoding or decoding occurs. The design applies the Weighting Size and Cost Replacement Policy (WSCRP) algorithm to efficiently upload and reuse matrices, using parameters known as weights and costs. Furthermore, the cache table can be managed efficiently because the weight–cost model sorts and updates the matrices by these parameters, which reduces replacement cost. The experiment utilized the Hadoop Distributed File System (HDFS) as the distributed file system, and the EC volume was composed of Reed–Solomon code with parameters (6, 3). As a result of the experiment, it was possible to reduce the write, read, and recovery times associated with encoding and decoding. In particular, with up to three node failures, systems using WSCRP were able to reduce recovery time by about 30 s compared to regular HDFS.
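The reuse idea in the abstract can be illustrated with a small sketch. The abstract does not give the WSCRP formula, so everything here is an assumption made for illustration: the class name `MatrixCache`, the eviction score (reuse count times rebuild cost, divided by matrix size), and the helper `toy_matrix`, which is a simple stand-in for a real Reed–Solomon coding matrix over a Galois field. The only point it demonstrates is that a matrix keyed by the set of surviving blocks is built once and then served from the cache, with low-scoring entries evicted first.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Sequence

Matrix = List[List[int]]


@dataclass
class Entry:
    matrix: Matrix
    size: int       # number of matrix elements held in the cache
    cost: float     # hypothetical cost of regenerating the matrix
    hits: int = 0   # reuse count, used here as the "weight"


class MatrixCache:
    """Size-bounded cache with an assumed weight/cost eviction score."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # maximum total elements
        self.used = 0
        self.builds = 0                   # how many times a matrix was generated
        self.table: Dict[Hashable, Entry] = {}

    def _score(self, entry: Entry) -> float:
        # Assumption: favour entries that are reused often and are
        # expensive to rebuild relative to the space they occupy.
        return (entry.hits + 1) * entry.cost / entry.size

    def get(self, key: Hashable, build: Callable[[], Matrix],
            cost: float = 1.0) -> Matrix:
        entry = self.table.get(key)
        if entry is not None:             # cache hit: reuse the stored matrix
            entry.hits += 1
            return entry.matrix
        matrix = build()                  # cache miss: generate once
        self.builds += 1
        size = sum(len(row) for row in matrix)
        if size > self.capacity:          # too large to cache at all
            return matrix
        while self.used + size > self.capacity:
            victim = min(self.table, key=lambda k: self._score(self.table[k]))
            self.used -= self.table.pop(victim).size
        self.table[key] = Entry(matrix, size, cost)
        self.used += size
        return matrix


def toy_matrix(survivors: Sequence[int], k: int = 6, p: int = 257) -> Matrix:
    # Stand-in only: a Vandermonde-style matrix modulo a prime, one row per
    # surviving block. Real Reed-Solomon codecs use Galois-field arithmetic.
    return [[pow(i + 1, j, p) for j in range(k)] for i in survivors]


# Usage: with a (6, 3) layout, blocks 0-8 exist; suppose block 5 is lost.
cache = MatrixCache(capacity=72)          # room for two 6x6 matrices
survivors = (0, 1, 2, 3, 4, 6)
first = cache.get(survivors, lambda: toy_matrix(survivors))
second = cache.get(survivors, lambda: toy_matrix(survivors))
print(first is second, cache.builds)
```

Keying the cache on the tuple of surviving block indices means that repeated recoveries from the same failure pattern never rebuild the matrix. A production policy would also need a measured regeneration cost in place of the constant default used here.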
Similar resources
Cache based fault recovery for distributed systems
No cache-based techniques for roll-forward fault recovery exist at present. A split-cache approach is proposed that provides efficient support for checkpointing and roll-forward fault recovery in distributed systems. This approach obviates the use of discrete stable storage or explicit synchronization among the processors. Stability of the checkpoint intervals is used as a driver for real time op...
Write Caching in Distributed File Systems
Disk caches are employed in distributed file systems to avoid network accesses at clients and to compensate for the speed differential between main memory and disk at file servers. Because of concerns about volatility, however, write requests have typically not benefited from the presence of caches. Instead, they have been processed with some sort of write-through or periodic write-back approach to ...
Extending DIRAC File Management with Erasure-Coding for efficient storage
The state of the art in Grid style data management is to achieve increased resilience of data via multiple complete replicas of data files across multiple storage endpoints. While this is effective, it is not the most space-efficient approach to resilience, especially when the reliability of individual storage endpoints is sufficiently high that only a few will be inactive at any point in time....
Erasure Coding in Distributed Storage Systems
Data centers are nowadays highly distributed, possibly over continents, to cope with the huge amounts of data collected these days. To deal with data loss in an environment where hardware failures are very common, researchers have developed new kinds of erasure codes. Erasure coding is a common technique used in all sorts of digital communication, including deep space communication and QR-codes...
Distributed File System Based on Erasure Coding for I/O-Intensive Applications
Distributed storage systems take advantage of network, storage and computational resources to provide a scalable infrastructure. But in such large systems, failures are frequent and expected. Data replication is the common technique to provide fault tolerance but suffers from its high storage consumption. Erasure coding is an alternative that offers the same data protection but reduces ...
Journal
Journal title: Symmetry
Year: 2023
ISSN: 0865-4824, 2226-1877
DOI: https://doi.org/10.3390/sym15040872